
    Integration of Data Mining into Scientific Data Analysis Processes

    In recent years, the use of advanced semi-interactive data analysis algorithms, such as those from the field of data mining, has gained more and more importance in the life sciences in general and in bioinformatics, genetics, medicine and biodiversity in particular. Today, there is a trend away from collecting and evaluating data only in the context of a specific problem or study and towards extensively collecting data from different sources in repositories, where they are potentially useful for subsequent analysis, e.g. in the Gene Expression Omnibus (GEO) repository of high-throughput gene expression data. At the time the data are collected, they are analysed in a specific context, which influences the experimental design. However, the types of analyses that the data will be used for after they have been deposited are not known. Content and data format are geared only to the first experiment, not to future reuse. Thus, complex process chains are needed for the analysis of the data. Such process chains need to be supported by the environments that are used to set up analysis solutions. Building specialized software for each individual problem is not a solution, as this effort can only be carried out for huge projects running for several years. Hence, data mining functionality has been packaged into toolkits, which provide it in the form of a collection of different components. Depending on the different research questions of the users, the solutions consist of distinct compositions of these components. Today, existing solutions for data mining processes comprise different components that represent different steps in the analysis process. Graphical or script-based toolkits exist for combining such components. The data mining tools that can serve as components in analysis processes are based on single-computer environments, local data sources and single users.
However, analysis scenarios in medical informatics and bioinformatics have to deal with multi-computer environments, distributed data sources and multiple users who have to cooperate. Users need support for integrating data mining into analysis processes in the context of such scenarios, and this support is lacking today. Typically, analysts working with single-computer environments face the problem of large data volumes, since tools do not address scalability and access to distributed data sources. Distributed environments such as grid environments provide scalability and access to distributed data sources, but the integration of existing components into such environments is complex. In addition, new components often cannot be developed directly in distributed environments. Moreover, in scenarios involving multiple computers, multiple distributed data sources and multiple users, the reuse of components, scripts and analysis processes becomes more important, as more steps and configuration are necessary and thus much greater effort is needed to develop and set up a solution. In this thesis we introduce an approach for supporting interactive and distributed data mining for multiple users, based on infrastructure principles that allow building on data mining components and processes that are already available instead of designing a completely new infrastructure, so that users can keep working with their well-known tools. In order to achieve the integration of data mining into scientific data analysis processes, this thesis proposes a stepwise approach to supporting the user in the development of analysis solutions that include data mining. We see our major contributions as the following: first, we propose an approach to integrate data mining components developed for a single-processor environment into grid environments. In this way, we support users in reusing standard data mining components with little effort.
The approach is based on a metadata schema definition that is used to grid-enable existing data mining components. Second, we describe an approach for interactively developing data mining scripts in grid environments. The approach efficiently supports users when it is necessary to enhance available components, to develop new data mining components, and to compose these components. Third, building on that, we present an approach for facilitating the reuse of existing data mining processes based on process patterns. It supports users in scenarios that cover different steps of the data mining process and include several components or scripts. The data mining process patterns support the description of data mining processes at different levels of abstraction, between the CRISP model as the most general representation and executable workflows as the most concrete.

    Introducing the NLU Showroom: A NLU Demonstrator for the German Language

    We present the NLU Showroom, a platform for interactively demonstrating the functionality of natural language understanding models with easy-to-use visual interfaces. The NLU Showroom focuses primarily on the German language, as not many German NLU resources exist. However, it also serves corresponding English models to reach a broader audience. With the NLU Showroom we demonstrate and compare the capabilities and limitations of a variety of NLP/NLU models. The four initial demonstrators include (a) a comparison of how different word representations capture semantic similarity, (b) a comparison of how different sentence representations interpret sentence similarity, (c) a showcase on analyzing reviews with NLU, and (d) a showcase on finding links between entities. The NLU Showroom is built on state-of-the-art architectures for model serving and data processing. It targets a broad audience, from newcomers to researchers, but focuses on placing the presented models in the context of industrial applications.

    Potency of transgenic effectors for neurogenetic manipulation in Drosophila larvae

    Genetic manipulations of neuronal activity are a cornerstone of studies that aim to identify the functional impact of defined neurons on animal behavior. With its small nervous system, rapid life cycle, and genetic amenability, the fruit fly Drosophila melanogaster provides an attractive model system to study neuronal circuit function. In the past two decades, a large repertoire of elegant genetic tools has been developed to manipulate and study neural circuits in the fruit fly. Current techniques allow genetic ablation, constitutive silencing, or hyperactivation of neuronal activity and also include conditional thermogenetic or optogenetic activation or inhibition. As for all genetic techniques, the choice of the proper transgenic tool is essential for behavioral studies. Potency and impact of effectors may vary in distinct neuron types or distinct types of behavior. Here we systematically test genetic effectors for their potency to alter the behavior of Drosophila larvae, using two distinct behavioral paradigms: general locomotor activity and directed, visually guided navigation. Our results show largely similar but not equal effects with different effector lines in both assays. Interestingly, differences in the magnitude of induced behavioral alterations between different effector lines remain largely consistent between the two behavioral assays. The observed potencies of the effector lines in aminergic and cholinergic neurons assessed here may help researchers to choose the best-suited genetic tools to dissect neuronal networks underlying the behavior of larval fruit flies.

    Mort a Venècia (Death in Venice)

    Since the beginning of the Covid-19 lockdown, the Facultat de Filosofia i Lletres of the UAB has been publishing a series of bite-sized pieces in the form of short articles, under the title 'Llibres i música en temps de desassossec' ('Books and music in times of unrest'), which invite the reader to discover various suggestions for reading or listening to music that help improve one's mood and provide knowledge in moments that are difficult and uncertain for everyone. 'Llibres i música en temps de desassossec' features texts by professors of the Faculty. Text published as a news item on the website of the Facultat de Filosofia i Lletres of the Universitat Autònoma de Barcelona on 27/03/202

    Dust Devil Tracks

    Dust devils that leave dark- or light-toned tracks are common on Mars and they can also be found on the Earth’s surface. Dust devil tracks (hereinafter DDTs) are ephemeral surface features with mostly sub-annual lifetimes. Regarding their size, DDT widths can range between ∼1 m and ∼1 km, depending on the diameter of the dust devil that created the track, and DDT lengths range from a few tens of meters to several kilometers, limited by the duration and horizontal ground speed of dust devils. DDTs can be classified into three main types based on their morphology and albedo in contrast to their surroundings; all are found on both planets: (a) dark continuous DDTs, (b) dark cycloidal DDTs, and (c) bright DDTs. Dark continuous DDTs are the most common type on Mars. They are characterized by their relatively homogeneous and continuous low-albedo surface tracks. Based on terrestrial and martian in situ studies, these DDTs most likely form when surficial dust layers are removed to expose larger-grained substrate material (coarse sands of ≥500 μm in diameter). The exposure of larger-grained materials changes the photometric properties of the surface, leading to lower-albedo tracks because grain size is photometrically inversely proportional to the surface reflectance. However, although not observed so far, compositional differences (i.e., color differences) might also lead to albedo contrasts when dust is removed to expose substrate materials with mineralogical differences. For dark continuous DDTs, albedo drop measurements are around 2.5 % in the wavelength range of 550–850 nm on Mars and around 0.5 % in the wavelength range of 300–1100 nm on Earth. The removal of an equivalent layer thickness of around 1 μm is sufficient for the formation of visible dark continuous DDTs on Mars and Earth. The next type of DDTs, dark cycloidal DDTs, are characterized by their low-albedo pattern of overlapping scallops.
Terrestrial in situ studies imply that they are formed when sand-sized material that is eroded from the outer vortex area of a dust devil is redeposited in annular patterns in the central vortex region. This type of DDT can also be found on Mars in orbital image data, and although in situ studies are lacking, terrestrial analog studies, laboratory work, and numerical modeling suggest they have the same formation mechanism as those on Earth. Finally, bright DDTs are characterized by their continuous track pattern and high albedo compared to their undisturbed surroundings. They are found on both planets, but to date they have only been analyzed in situ on Earth. Here, the destruction of aggregates of dust, silt and sand by dust devils leads to smooth surfaces in contrast to the undisturbed rough surfaces surrounding the track. The resulting change in photometric properties occurs because the smoother surfaces have a higher reflectance compared to the surrounding rough surface, leading to bright DDTs. On Mars, the destruction of surficial dust aggregates may also lead to bright DDTs. However, more highly reflective surfaces may be produced by other formation mechanisms, such as dust compaction by passing dust devils, as this may also cause changes in photometric properties. On Mars, DDTs in general are found at all elevations and on a global scale, except on the permanent polar caps. DDT maximum areal densities occur during spring and summer in both hemispheres, produced by an increase in dust devil activity caused by maximum insolation. Regionally, dust devil densities vary spatially, likely controlled by changes in dust cover thicknesses and substrate materials. This variability makes it difficult to infer dust devil activity from DDT frequencies. Furthermore, only a fraction of dust devils leave tracks.
However, DDTs can be used as proxies for dust devil lifetimes and wind directions and speeds, and they can also be used to predict lander or rover solar panel clearing events. Overall, the high DDT frequency in many areas on Mars leads to drastic albedo changes that affect large-scale weather patterns.

    History and Applications of Dust Devil Studies

    Studies of dust devils, and their impact on society, are reviewed. Dust devils have been noted since antiquity, and have been documented in many countries, as well as on the planet Mars. As time-variable vortex entities, they have become a cultural motif. Three major stimuli of dust devil research are identified: nuclear testing, terrestrial climate studies, and, perhaps most significantly, Mars research. Dust devils present an occasional safety hazard to light structures and have caused several deaths.